ABSTRACT

Although much research has been directed at dealing with outliers, particularly for a N(μ, σ²) distribution, little work has been devoted to estimating σ² when outliers may be present. This report describes a comparison of some proposed estimators of σ² for the case of data which, except for spurious observations, result from a N(μ, σ²) distribution. All the estimators considered are nonadaptive and can accommodate up to n/2 outliers from a sample of size n. Monte Carlo simulation was used for the comparison of MSE, which was selected as a measure of performance. Interval estimates based on several of the estimators are also considered.
I. INTRODUCTION


Much research has been directed at dealing with outliers, particularly in the case of data which, except for spurious observations, result from a N(μ, σ²) distribution. Major emphasis has been placed on techniques for use when μ is to be estimated, while little research has been devoted to estimating σ². Primary research for the estimation of σ² when outliers may be present has been done by Guttman and Smith (1971) and Johnson, McGuire and Milliken (1978). This technical report describes the results of a small scale comparison of some of their proposed variance estimators and one additional estimator. Both point estimates and interval estimates are considered.


II.  DEFINITION  OF  ESTIMATORS 


The variance estimators proposed by Guttman and Smith (GS) were developed for samples of observations hopefully all from a N(μ, σ²) distribution, but with at most one observation from a N(μ + aσ, σ²) or N(μ, (1 + b)σ²), b > 0. These estimators are adaptive in that the value of a sample statistic determines the form of the estimator. Three different adaptive estimators or rules were developed: A-Rule, W-Rule, and S-Rule. Each rule essentially uses s², the usual variance estimator, with a modified data set or the original data set depending on whether or not there is sufficient evidence to conclude that the suspect observation is an outlier. When a modified data set is used, the A-Rule (Anscombe's method) eliminates the suspected outlier, the W-Rule (Winsorization) replaces the suspected outlier with the nearest retained observation, and the S-Rule (Semi-Winsorization) replaces the suspected outlier with the critical value for the rule.

The estimators proposed by Johnson, McGuire and Milliken (JMM) are also for samples from a N(μ, σ²) distribution, but with as many as 50% of the observations arising from a N(μ + aσ, σ²). Although JMM discussed several forms for a variance estimator, only their V_k estimator will be considered here. This estimator, like those of GS, was developed as a modification of s². Consider a sample x_1, x_2, ..., x_n, let

$$u_{ij} = |x_i - x_j| \quad \text{for } i < j = 2, 3, \ldots, n$$

and let u_(1) ≥ u_(2) ≥ ... ≥ u_(n(n-1)/2) be the ordered u_ij's. Then s² can be written as

$$s^2 = \sum_{i=1}^{n(n-1)/2} u_{(i)}^2 \Big/ n(n-1).$$

V_k, which assumes k outliers in a sample of size n, modifies s² by eliminating the k(n - k) largest u_ij's. Thus,

$$V_k = \sum_{i=k(n-k)+1}^{n(n-1)/2} u_{(i)}^2 \Big/ [k(k-1) + (n-k)(n-k-1)].$$

V_k is not adaptive but rather requires the experimenter to specify the suspected number of outliers (k).
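The definition of V_k above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by JMM or in this study; the function name `jmm_variance` and the use of NumPy are assumptions made for the example.

```python
import numpy as np


def jmm_variance(x, k):
    """V_k: s^2 rebuilt from pairwise differences, minus the k(n-k) largest.

    With k = 0 nothing is dropped and the result equals the usual s^2.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 0 <= k <= n // 2:
        raise ValueError("k must satisfy 0 <= k <= n/2")
    denom = k * (k - 1) + (n - k) * (n - k - 1)
    if denom == 0:
        raise ValueError("no pairwise differences would remain")

    # All n(n-1)/2 absolute differences |x_i - x_j| with i < j.
    i, j = np.triu_indices(n, k=1)
    u = np.abs(x[i] - x[j])

    # Sort in descending order and discard the k(n-k) largest differences.
    u_desc = np.sort(u)[::-1]
    kept = u_desc[k * (n - k):]
    return float(np.sum(kept ** 2) / denom)


sample = [1.0, 2.0, 4.0, 7.0, 100.0]
v0 = jmm_variance(sample, 0)  # equals np.var(sample, ddof=1)
v1 = jmm_variance(sample, 1)  # the four differences involving 100.0 are dropped
```

In this example the four largest differences are exactly those involving 100.0, so V_1 reduces to the sample variance of the remaining four observations.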

Assuming the same framework as JMM (nonadaptive estimators for use with samples where the outliers result from a N(μ + aσ, σ²) distribution), the principles of the GS rules can be extended to accommodate samples with up to 50% outliers. Here, as with the V_k estimator, the experimenter would specify the number of suspected outliers.

Under the assumption that the outliers are from a normal distribution with shifted mean, the suspected outliers must be either the largest or smallest k observations. In practice the experimenter may know whether the largest or smallest observations are suspect. However, it will be assumed that in general this must be determined. Thus, one of two possible groupings must be selected: the smallest k as outliers and the remaining n - k as N(μ, σ²), or the largest k as outliers and the remaining n - k as N(μ, σ²). The ratio of the between-sum-of-squares to the within-sum-of-squares (B/W), for identifying clusters from normal distributions with the same variance but different means, can be used for this purpose. [See Engelman and Hartigan (1969).] This ratio is determined for each of the two partitions, {k, n - k} and {n - k, k}. The grouping with the maximum B/W represents the most likely clustering of samples from two normal distributions.

The case with an even sample size where n/2 outliers are assumed requires special consideration. Here there is only one possible grouping {n/2, n/2} and thus the B/W approach for identifying the set of outliers cannot be applied. If there actually were n/2 outliers, then under the assumption of equal variance for the two distributions it would not matter which half of the sample was designated to be the outliers, as either set could be used to obtain an estimate of σ². However, if the assumption of n/2 outliers is incorrect, then the "mixed" half, that containing some observations from a N(μ, σ²) and some outliers, would have a larger sample variance and would lead to an overestimation of σ². Therefore, for this special case the sample variances are calculated for each half of the data set and that half with the larger sample variance is labeled as the set of outliers.
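The grouping step described above, including the special case k = n/2, might be sketched as follows. The function names `bw_ratio` and `split_outliers` are hypothetical, and the tie-breaking choice (the upper group is labeled as outliers when the two criteria are equal) is an assumption not stated in the text.

```python
import numpy as np


def bw_ratio(low, high):
    """Between-group over within-group sum of squares for a two-group split."""
    grand = np.concatenate([low, high]).mean()
    between = (low.size * (low.mean() - grand) ** 2
               + high.size * (high.mean() - grand) ** 2)
    within = (np.sum((low - low.mean()) ** 2)
              + np.sum((high - high.mean()) ** 2))
    return between / within  # within is zero only for degenerate (tied) data


def split_outliers(x, k):
    """Return (outliers, retained) as sorted arrays for k assumed outliers."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    if n < 3 or not 1 <= k <= n // 2:
        raise ValueError("need n >= 3 and 1 <= k <= n/2")

    if 2 * k == n:
        # Only one partition exists: label the half with the larger
        # sample variance as the outliers.
        low, high = xs[:k], xs[k:]
        if np.var(low, ddof=1) > np.var(high, ddof=1):
            return low, high
        return high, low

    # Partition {k, n - k}: smallest k as outliers.
    ratio_low = bw_ratio(xs[:k], xs[k:])
    # Partition {n - k, k}: largest k as outliers.
    ratio_high = bw_ratio(xs[:n - k], xs[n - k:])
    if ratio_low > ratio_high:
        return xs[:k], xs[k:]
    return xs[n - k:], xs[:n - k]


outliers, retained = split_outliers([0.1, -0.3, 0.5, 0.0, -0.2, 9.0, 8.5], 2)
```

Sorting first means each candidate group is a contiguous block of order statistics, which matches the requirement that the suspected outliers be the k smallest or k largest observations.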

Having determined which group of k observations to consider as the outliers, the extensions for two of the GS rules are straightforward. A_k, corresponding to the A-Rule, eliminates the k suspected outliers and calculates s² based on the remaining n - k observations. W_k, corresponding to the W-Rule, replaces the k suspected outliers with the closest retained observation and calculates s² using the modified sample of n observations. Because a critical value has not been specified, an extension to the S-Rule is not so easily defined. Due to the preliminary nature of this study, no attempt was made to define S_k.

An additional estimator for σ² considered in this study is the pooled sample variance for the group of k assumed outliers and the group of the remaining n - k observations. This estimator will be denoted by P_k.
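Given a split of the sample into suspected outliers and retained observations, the three estimators A_k, W_k and P_k can be illustrated as below. This is a sketch under stated assumptions: the function names are hypothetical, the two groups are passed in directly, and P_k is taken to be the usual pooled variance with n - 2 degrees of freedom, which the text does not spell out.

```python
import numpy as np


def a_estimator(retained):
    """A_k: usual s^2 computed from the n - k retained observations only."""
    return float(np.var(np.asarray(retained, dtype=float), ddof=1))


def w_estimator(outliers, retained):
    """W_k: replace every suspected outlier by the closest retained value."""
    outliers = np.asarray(outliers, dtype=float)
    retained = np.asarray(retained, dtype=float)
    # The outliers are the k largest or k smallest values, so the closest
    # retained observation is the retained maximum or minimum.
    if outliers.mean() > retained.mean():
        nearest = retained.max()
    else:
        nearest = retained.min()
    modified = np.concatenate([retained, np.full(outliers.size, nearest)])
    return float(np.var(modified, ddof=1))


def p_estimator(outliers, retained):
    """P_k: pooled within-group variance (assumed divisor n - 2)."""
    outliers = np.asarray(outliers, dtype=float)
    retained = np.asarray(retained, dtype=float)
    ss = (np.sum((outliers - outliers.mean()) ** 2)
          + np.sum((retained - retained.mean()) ** 2))
    return float(ss / (outliers.size + retained.size - 2))


retained = [1.0, 2.0, 3.0, 4.0]
outliers = [10.0, 12.0]
a_k = a_estimator(retained)            # variance of 1, 2, 3, 4
w_k = w_estimator(outliers, retained)  # variance of 1, 2, 3, 4, 4, 4
p_k = p_estimator(outliers, retained)  # (2 + 5) / (6 - 2)
```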
III. MONTE CARLO COMPARISONS


A Monte Carlo simulation was conducted to compare the different nonadaptive variance estimators, which require that the number of outliers be specified. In addition to evaluating the estimators using the correct number of outliers, the simulation, which was set up to parallel that done by JMM, looks at the performance of the estimators in several cases where the number of outliers is specified incorrectly.

The factors considered in the simulation were:

    n - the sample size
    k_1 - the number of actual outliers
    a - the bias in the mean (in units of σ)
    k - the assumed number of outliers
    and N - the number of random samples simulated.

Given a sample size (n), an actual number of outliers (k_1), and a bias (a), N random samples were generated with n - k_1 observations from the N(0, 1) and k_1 observations from the N(a, 1). For each sample the values of A_k, W_k, P_k and V_k were determined for all k included in the study. Based on these simulated values an estimate of the expected value and MSE for each estimator was obtained. The results of this simulation can be easily transformed to the appropriate values for sampling from a N(μ, σ²) with outliers from a N(μ + aσ, σ²).
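The simulation design just described can be sketched as a small harness. It is an illustration only: the function name `simulate`, the seeded NumPy generator, and the use of s² as the example estimator are assumptions, not details of the original program.

```python
import numpy as np


def simulate(estimator, n, k1, a, n_samples, seed=0):
    """Estimate the expected value and MSE of a variance estimator.

    Each sample has n - k1 observations from N(0, 1) and k1 from N(a, 1),
    so the target value of sigma^2 is 1.
    """
    rng = np.random.default_rng(seed)
    values = np.empty(n_samples)
    for r in range(n_samples):
        x = rng.normal(0.0, 1.0, size=n)
        x[:k1] += a  # shift the first k1 observations to N(a, 1)
        values[r] = estimator(x)
    mean = float(values.mean())
    mse = float(np.mean((values - 1.0) ** 2))
    return mean, mse


def usual_s2(x):
    """The usual variance estimator s^2."""
    return float(np.var(x, ddof=1))


n = 10
n_samples = 10000 // n  # the number of samples used in the report
clean_mean, clean_mse = simulate(usual_s2, n, k1=0, a=0.0, n_samples=n_samples)
dirty_mean, dirty_mse = simulate(usual_s2, n, k1=2, a=9.0, n_samples=n_samples)
```

For an estimator that is unchanged by adding a constant to every observation and that scales by c² when the data are multiplied by c, results for a general N(μ, σ²) follow by multiplying the simulated expected value by σ² and the simulated MSE by σ⁴.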

Due to the exploratory nature of this study not all the cases included in the JMM paper were considered. Only sample sizes of ten and twenty-five were used, and for each of these only a subset of the JMM cases was completed. The cases included in the study are outlined in Figure 1.

The number of random samples generated for each case was the same as that used by JMM (10,000/n). This number of samples provides reasonable confidence limits for the estimates in an acceptable amount of computer time.
Samples with n = 10 Observations:

    k_1 = 1, 2, 3, 5
    a = 3.0, 6.0, 9.0
    k = 1, 2, 3, 5

Samples with n = 25 Observations:

    k_1 = 0, 1, 2, 3, 5, 7, 10, 12
    a = 3.0, 9.0
    k = 12

k_1 is the number of actual outliers,
a is the bias in the mean (in units of σ),
and k is the assumed number of outliers.

Figure 1: Simulation Cases Completed
IV. RESULTS OF THE SIMULATION


The expected values and mean square errors obtained from the simulation are presented in Figures 2 to 5. The estimated maximum standard error (in percentage) observed for each estimate is included to provide an indication of the accuracy of the simulation. As a check on the simulation procedure the values for V_k and s² were calculated and are presented in the tables. These values are in accordance with those presented by JMM.

For some cases it is also possible to check the expected values of the A, W and P estimators obtained from the simulation. When the bias is large (e.g., 9.0), it is reasonable to assume that the simulated outliers are larger than the N(0, 1) observations. Thus, the ordered observations can be split into two groups, the largest k_1 making up a random sample from N(9, 1) and the smallest n - k_1 making up a random sample from N(0, 1). The tabulated values of the moments of the order statistics for samples from the normal distribution [Sarhan and Greenberg (1962)] can then be used to obtain the expected values of the A_k, W_k and P_k estimators. Unfortunately, the moments needed to calculate the MSE for these estimators have not been tabulated. The results of the analytical check for samples of ten observations containing one outlier with bias of 9.0 are presented in Figure 6.

The MSE was used to compare the performance of the estimators. From Figures 2 to 4 it can be seen that [symbol illegible] is the best estimator when the number of outliers is underestimated. When the number of outliers is overestimated and the bias is small (a = 3.0), P_k has the smallest MSE. However, for larger bias (a = 6.0 or a = 9.0), P_k breaks down. As can be seen from Figures 3, 4 and 5, no single estimator is unconditionally superior when the bias is large and the number of outliers is overestimated.

Figure 2: Estimated Expected Values and Mean Square Errors For Sample Size n = 10 and Bias a = 3.0

Figure 3: Estimated Expected Values and Mean Square Errors For Sample Size n = 10 and Bias a = 6.0

Figure 5: Estimated Expected Values and Mean Square Errors For Sample Size n = 25 and k = 12

Despite the breakdown of P_k with large biases, its exceptional performance with a small bias suggests that a modification of the estimator should be considered. A simplistic form of an adaptive rule to replace P_k was defined and briefly evaluated. This modified pooled estimator is

    MP_k = P_k    if [condition comparing s*² and s_1² at level α; illegible]
         = A_k    otherwise

where s*² is the sample variance of the k suspected outliers and s_1² is the sample variance of the n - k remaining observations. The MSE of this modified estimator is bounded by that of P_k and A_k.

This estimator was evaluated for sample sizes of 25 assuming twelve outliers and α = .05. The performance of P_12 and MP_12 is presented in Figure 7. It can be seen that the MP_k rule does not do as well as P_k in the case of a small bias; however, it does much better than P_k with large biases. Thus, even a very simple adaptive rule can greatly improve an estimate. This suggests that adaptive rules should be further investigated, particularly rules which would not require specification of the number of outliers.

When assuming twelve outliers with large bias in a sample of twenty-five observations, the MSE values (Figure 5) display some peculiar results which warrant explanation. Specifically, there is an unexpected jump in the MSE of the A estimator for the case of one actual outlier and an unexpected jump in the MSE of the W estimator for the case of twelve actual outliers. The problem with the A estimator is a result of the method used to identify the set of outliers. When there is actually only one outlier but twelve outliers are assumed, the B/W ratios for the two groupings {12, 13} and {13, 12} are very close in value; thus the wrong set of twelve observations is sometimes designated as the outliers. When the bias in the outliers is large and the wrong set of observations (i.e., the set containing the outlier) is used, the A estimator overestimates σ², leading to a large MSE. Once again this points out the need for adaptive rules which do not require the specification of the number of outliers.

Figure 7: Estimated Expected Values and Mean Square Errors For Sample Size n = 25 and k = 12

The large MSE for the W estimator when twelve outliers with a large bias are present is not a result of incorrectly identifying the outliers but rather is an inherent property of the estimator. In this case the B/W ratio method correctly selects the set of twelve outliers and the W estimator is applied to the sample of thirteen observations from a N(0, 1). Depending on whether the bias is positive or negative, the largest or smallest retained ordered observation (x_(n-k) or x_(k+1), respectively) is weighted heavily by the W estimator. The heavy weighting of this extreme observation can lead to overestimation of σ² and consequently a large MSE. Based on these results the W estimator appears to be inappropriate when outliers make up a large proportion of the sample.


V. USING THE NONADAPTIVE ESTIMATORS


Due to the bias of the estimator V_k and the difficulty of specifying the number of outliers, JMM suggested that V_k should be used in conjunction with an estimator V_k*. V_k* is simply V_k modified to be unbiased in the case of no outliers. That is, V_k* = V_k/v_k where v_k = E(V_k/σ²) given no outliers.

When the number of outliers is overestimated, V_k underestimates σ² whereas V_k* overestimates σ². JMM suggested that when outliers are suspected, (V_L, V_L*) should be used as an interval estimate for σ², where L is an upper bound for the number of outliers in the sample. If the experimenter has no estimate for L, JMM propose using n/2.
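The interval (V_L, V_L*) can be illustrated with the short sketch below. It estimates the null constant v_L by simulation from N(0, 1) samples and then divides V_L by it. The helper names `v_k`, `null_constant` and `interval_estimate`, the number of null samples, and the seeds are all assumptions made for the example.

```python
import numpy as np


def v_k(x, k):
    """V_k: drop the k(n-k) largest pairwise differences, then rescale."""
    x = np.asarray(x, dtype=float)
    n = x.size
    i, j = np.triu_indices(n, k=1)
    u_desc = np.sort(np.abs(x[i] - x[j]))[::-1]
    kept = u_desc[k * (n - k):]
    return float(np.sum(kept ** 2) / (k * (k - 1) + (n - k) * (n - k - 1)))


def null_constant(n, k, n_samples=2000, seed=0):
    """Monte Carlo estimate of v_k = E(V_k / sigma^2) with no outliers."""
    rng = np.random.default_rng(seed)
    draws = [v_k(rng.normal(size=n), k) for _ in range(n_samples)]
    return float(np.mean(draws))


def interval_estimate(x, L, v_L):
    """Return (V_L, V_L*) with V_L* = V_L / v_L."""
    lower = v_k(x, L)
    return lower, lower / v_L


rng = np.random.default_rng(1)
x = rng.normal(size=10)
x[:2] += 6.0                # two outliers with bias 6.0
v_5 = null_constant(10, 5)  # L = n/2 when no better bound is available
lower, upper = interval_estimate(x, 5, v_5)
```

Because V_k averages only the smallest squared differences, v_k lies below one, so the rescaled value V_L* always exceeds V_L and the pair forms a proper interval.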

It can be seen from Figures 2 through 5 that when the number of outliers is overestimated, A_k and W_k also underestimate σ². Thus, the intervals (A_L, A_L*) and (W_L, W_L*) could also serve as interval estimates for σ². These interval estimates were evaluated for a sample size of ten with five assumed outliers and for a sample size of twenty-five with twelve assumed outliers. The expected values of V_k, A_k and W_k in the null case, which are needed to calculate V_5*, A_5*, W_5*, V_12*, A_12* and W_12*, were obtained by Monte Carlo simulation. These values are 0.141, 0.224, 0.176, 0.141, 0.300 and 0.256, respectively. The interval estimates are presented in Figures 8 and 9. A comparison of interval lengths reveals that for two or more actual outliers, the A intervals put the tightest bounds on σ², followed by the W and then the V intervals.


Figure 8: Interval Estimates For σ² With Sample Size
[Table values not recoverable. Columns: Bias, (V_L, V_L*), Length, (A_L, A_L*), Length, (W_L, W_L*), Length. Rows: k_1 = 1, 2, 3, 5, each at bias 3.0, 6.0 and 9.0.]

Figure 9: Interval Estimates For σ² With Sample Size


VI.  CONCLUSIONS 


This  Monte  Carlo  investigation  has  shown  that  the  performance  of 
the  nonadaptive  estimators  greatly  depends  on  the  assumed  number  of 
outliers  in  relation  to  the  true  number.  This  points  out  the  need  for 
adaptive procedures which could be applied without requiring the experimenter to estimate the number of outliers.

When nonadaptive estimators are to be used, an interval estimation procedure may prove useful. This Monte Carlo study has indicated that interval estimates based on A_k or W_k tend to be superior to (i.e., produce tighter bounds than) the interval estimate suggested by JMM.